Performance of Speaker-independent Speech Recognisers for Automatic Recognition of Australian English

Authors

  • Kollengode S. Ananthakrishnan
  • Ahmad Hashemi-Sakhtsari
  • Adam Barnes
  • Serge Bailes
Abstract

This paper investigates the performance of three speaker-independent speech recognisers (SISRs) that support continuous speech and are currently available for speaker-independent recognition of English. The recognisers were tested on a subset of the Australian National Database Of Spoken Language (ANDOSL) for the recognition of digits, sentences and a short paragraph under the “clean” data condition. Recognition accuracy was evaluated in two sets of tests, one using a limited number of speakers (13) and the other using 75 of the speakers in the ANDOSL database. The paper describes the preparation undertaken so that the recognisers could be compared under similar operating conditions. It does not reveal the names of the speech recognisers because the study reported here was limited. Nonetheless, the aim of the evaluation was to ascertain which of the recognisers would be most suitable for applications in a meeting room, where a subset of the vocabulary is selected as the active vocabulary for controlling devices and issuing commands. The variables investigated were the form of speech, the vocabulary size, and the age and gender of the speaker. The contribution of this paper to speech science and technology lies in its comparative evaluation of three speaker-independent speech recognisers and in its use of a database containing utterances spoken with an Australian English accent.

Similar Resources

Multi-Accent Speech Recognition of Afrikaans, Black and White Varieties of South African English

In this paper we investigate speech recognition performance of systems employing several accent-specific recognisers in parallel for the simultaneous recognition of multiple accents. We compare these systems with oracle systems, in which test utterances are presented to matching accent-specific recognisers, and with accent-independent systems, in which acoustic and language model training data ...

Impact of Vocal Tract Length Normalization on the Speech Recognition Performance of an English Vowel Phoneme Recognizer for the Recognition of Children Voices

Differences in human vocal tract lengths can cause inter-speaker acoustic variability in speech signals produced by different speakers for the same text, and these variations affect the robustness of a speaker-independent (SI) speech recognition system. Speaker normalization using vocal tract length normalization (VTLN) is an effective approach to reduce the effect of these...

Artificial neural networks as speech recognisers for dysarthric speech: Identifying the best-performing set of MFCC parameters and studying a speaker-independent approach

Dysarthria is a neurological impairment of controlling the motor speech articulators that compromises the speech signal. Automatic Speech Recognition (ASR) can be very helpful for speakers with dysarthria because the disabled persons are often physically incapacitated. Mel-Frequency Cepstral Coefficients (MFCCs) have been proven to be an appropriate representation of dysarthric speech, but the ...

Automatic modeling of user specific words for a speaker independent recognition system

The problem addressed in this paper is the incorporation of user-specific words in a speaker-independent speech recognition system. No transcription is used to model the new words; modeling is based on only a very small number of training utterances. We investigated two different modeling methods. The first is intended for small-vocabulary recognisers. The HMM models for the new words are enha...

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

Publication date: 2006